This paper considers non-negative integer-valued autoregressive processes where the
autoregression parameter is close to unity. We consider the asymptotics of this 'near unit root'
situation. The local asymptotic structure of the likelihood ratios of the model is obtained,
showing that the limit experiment is Poissonian. To illustrate the statistical consequences we
discuss efficient estimation of the autoregression parameter and efficient testing for a unit root.
1. Introduction

The non-negative integer-valued autoregressive process of order 1 (INAR(1)) was
introduced by Al-Osh and Alzaid (1987) and Alzaid (1988) as a non-negative integer- 
valued analogue of the AR(1) process. Al-Osh and Alzaid (1990) and Du and Li (1991) 
extended this work to INAR(p) processes. Recently there has been a growing inter- 
est in INAR processes. Without going into details we mention some recent (theoreti- 
cal) contributions to the literature on INAR processes: Freeland and McCabe (2005), 
Jung, Ronning and Tremayne (2005), Silva and Oliveira (2005), Silva and Silva (2006), 
Zheng, Basawa and Datta (2006), Neal and Subba Rao (2007) and Drost, Van den Akker 
and Werker (2008a, 2008b). Applications of INAR processes in the medical sciences 
can be found in, for example, Franke and Seligmann (1993), Belisle et al. (1998) and 
Cardinal, Roy and Lambert (1999); an application to psychometrics in Bockenholt 
(1999a), an application to environmentology in Thyregod et al. (1999); recent appli- 
cations to economics in, for example, Bockenholt (1999b), Berglund and Brannas (2001), 
Brannas and Hellstrom (2001), Rudholm (2001), Bockenholt (2003), Brannas and Quoreshi 
(2004), Freeland and McCabe (2004), Gourieroux and Jasiak (2004) and Drost, Van den 
Akker and Werker (2008c); and Ahn, Gyemin and Jongwoo (2000) and Pickands III and
Stine (1997) considered queueing applications. 

This paper considers a nearly nonstationary INAR(1) model and derives its limit
experiment (in the Le Cam framework). Our main result is that this limit experiment is
Poissonian. This is surprising since limit experiments are usually locally asymptotically
quadratic (LAQ; see Jeganathan (1995), Le Cam and Yang (1990) and Ling and McAleer
(2003)) and even non-regular models often enjoy a shift structure (see Hirano and Porter
(2003a, 2003b)), whereas the Poisson limit experiment does not enjoy these two
properties. The result is indeed surprising since Ispany, Pap and van Zuijlen (2003a)
established a functional limit theorem with an Ornstein-Uhlenbeck limit process from which
one would conjecture a standard LAQ-type limit experiment. On a technical level the
proof of the convergence to a Poisson limit experiment is interesting, since the 'score'
can be split into two parts that have different rates of convergence. To illustrate the
statistical consequences of the convergence to a Poisson limit experiment, we exploit this
limit experiment to construct efficient estimators of the autoregression parameter and
to construct an efficient test for the null hypothesis of a unit root. Since the INAR(1)
process is a particular branching process with immigration, this also partially solves the
question (see Wei and Winnicki (1990)) of how to estimate the offspring mean efficiently.
Furthermore, we show that the ordinary least squares (OLS) estimator, considered by
Ispany, Pap and van Zuijlen (2003a, 2003b, 2005), is inefficient. Surprisingly, the OLS
estimator even has a lower rate of convergence than the efficient one. Related to this,
we show that the classical Dickey-Fuller test for a unit root has no power against local
alternatives induced by the limit experiment. More precisely, as we will see below, the
autoregressive parameter in these local alternatives is of the form $1 - h/n^2$ with $h > 0$ and
$n$ denoting the number of observations. Of course, for alternatives at a further distance
the Dickey-Fuller test will have power but the efficient test can perfectly discriminate
between the null and the alternative in such cases.

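An INAR(1) process (starting at 0) is defined by the recursion, $X_0 = 0$, and,

\[ X_t = \vartheta \circ X_{t-1} + \varepsilon_t, \quad t \in \mathbb{N}, \tag{1} \]

where (by definition an empty sum equals 0),

\[ \vartheta \circ X_{t-1} = \sum_{j=1}^{X_{t-1}} Z_j^{(t)}. \tag{2} \]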

Here $(Z_j^{(t)})_{j \in \mathbb{N}, t \in \mathbb{N}}$ is a collection of i.i.d. Bernoulli variables with success
probability $\theta \in [0, 1]$, independent of the i.i.d. innovation sequence $(\varepsilon_t)_{t \in \mathbb{N}}$ with
distribution $G$ on $\mathbb{Z}_+ = \mathbb{N} \cup \{0\}$. All these variables are defined on a probability space
$(\Omega, \mathcal{F}, \mathbb{P}_{\theta,G})$. If we work with fixed $G$, we usually drop the subscript $G$. The random
variable $\vartheta \circ X_{t-1}$ is called the binomial thinning of $X_{t-1}$ (this operator was introduced by
Steutel and van Harn (1979)) and, conditionally on $X_{t-1}$, it follows a binomial distribution with
success probability $\theta$ and a number of trials equal to $X_{t-1}$. Equation (1) can be interpreted
as a branching process with immigration. The outcome $X_t$ is composed of $\vartheta \circ X_{t-1}$, the
elements of $X_{t-1}$ that survive during $(t-1, t]$, and $\varepsilon_t$, the number of immigrants during
$(t-1, t]$. Here the number of immigrants is independent of the survival of elements of $X_{t-1}$.
Moreover, each element of $X_{t-1}$ survives with probability $\theta$ and its survival has no effect
on the survival of the other elements. From a statistical point of view, the difference
between the literature on INAR processes and the literature on branching processes with
immigration is that in the latter setting one commonly observes both the $X$ process
and the $\varepsilon$ process, whereas one only observes the $X$ process in the INAR setting, which
complicates inference drastically. Compared to the familiar AR(1) processes, inference
for INAR(1) processes is also more complicated, since, even if $\theta$ is known, observing
$X$ does not imply observing $\varepsilon$. From the definition of an INAR process it immediately
follows that $\mathbb{E}_{\theta,G}[X_t \mid X_{t-1}, \ldots, X_0] = \theta X_{t-1} + \mathbb{E}_G \varepsilon_1$, which (partially) explains the 'AR'
in 'INAR'. It is well known that, if $\theta \in [0, 1)$ and $\mathbb{E}_G \varepsilon_1 < \infty$, which is called the 'stable'
case, there exists an initial distribution, $\nu_{\theta,G}$, such that $X$ is stationary if $\mathcal{L}(X_0) = \nu_{\theta,G}$.
Of course, the INAR(1) process is non-stationary if $\theta = 1$: under $\mathbb{P}_1$ the process $X$ is
nothing but a standard random walk with drift on $\mathbb{Z}_+$ (but note that $X$ is nondecreasing
under $\mathbb{P}_1$). We call this situation 'unstable' or say that the process has a 'unit root'.
Although the unit root is on the boundary of the parameter space, it is an important
parameter value since in many applications the estimates of $\theta$ are close to 1.
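The recursion (1)-(2) can be illustrated with a short simulation. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the thinning is implemented naively as a sum of Bernoulli draws, and the innovation law is assumed to be Geometric(1/2) purely for illustration.

```python
import random


def geometric_half(rng):
    """Draw from an assumed innovation law G = Geometric(1/2),
    i.e. mass (1/2)**(k + 1) at k = 0, 1, 2, ..."""
    k = 0
    while rng.random() < 0.5:  # each extra immigrant has probability 1/2
        k += 1
    return k


def binomial_thinning(theta, x, rng):
    """theta o x: number of survivors among x elements,
    each surviving independently with probability theta."""
    return sum(1 for _ in range(x) if rng.random() < theta)


def simulate_inar1(theta, n, draw_innovation, rng):
    """Return the path [X_0, ..., X_n] of recursion (1), started at X_0 = 0."""
    path = [0]
    for _ in range(n):
        survivors = binomial_thinning(theta, path[-1], rng)
        path.append(survivors + draw_innovation(rng))
    return path
```

With `theta = 1.0` every element survives, so the simulated path is a nondecreasing random walk with drift, in line with the unit root case described above.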

Denote the law of $(X_0, \ldots, X_n)$ under $\mathbb{P}_{\theta,G}$ on the measurable space $(\mathcal{X}_n, \mathcal{A}_n) =
(\mathbb{Z}_+^{n+1}, 2^{\mathbb{Z}_+^{n+1}})$ by $\mathbb{P}^{(n)}_{\theta,G}$. For $G$ known, the global model of interest is thus
$(\mathbb{P}^{(n)}_{\theta,G} \mid \theta \in [0, 1])$. The model restricted to the stable case $\theta \in [0, 1)$ has been shown to be
locally asymptotically normal (LAN) in Drost, Van den Akker and Werker (2008b) and Section 4.3.2 in
Van den Akker (2007). For this stable case, the OLS estimator is consistent and
asymptotically normal. The focus of interest of the present paper is, however, the unstable case
$\theta = 1$. Therefore we will introduce the local parameter $h \ge 0$ and take the autoregressive
parameter $\theta_n = 1 - h/n^2$ in (2). This is formalized below.

In our applications we mainly consider two sets of assumptions on $G$: (i) $G$ is known
or (ii) $G$ is completely unknown (apart from some regularity conditions). For expository
reasons, let us, for the moment, focus on the case that $G$ is completely known and the
goal is to estimate $\theta$. We use 'local-to-unity' asymptotics to take the 'increasing
statistical difficulty' in the neighborhood of the unit root into account, that is, we consider
local alternatives to the unit root in such a way that the increasing degree of difficulty
to discriminate between these alternatives and the unit root compensates the increase of
information contained in the sample as the number of observations grows. This approach
is well known; it originated from the work of Chan and Wei (1987) and Phillips (1987),
who studied the behavior of a given estimator (OLS) in a nearly unstable AR(1) setting,
and Jeganathan (1995), whose results yield the asymptotic structure of nearly
unstable AR models. Following this approach, we introduce the sequence of nearly unstable
INAR experiments $\mathcal{E}_n(G) = (\mathcal{X}_n, \mathcal{A}_n, (\mathbb{P}^{(n)}_{1-h/n^2} \mid h \ge 0))$, $n \in \mathbb{N}$. The 'localizing rate' $n^2$
will become apparent later on. It is surprising that the localizing rate is $n^2$, since for the
classical nearly unstable AR(1) model one has rate $n^{3/2}$ (non-zero intercept) or $n$ (no
intercept). Suppose that we have found an estimator $\hat{h}_n$ with 'nice properties'; then this
corresponds to the estimate $\hat{\theta}_n = 1 - \hat{h}_n/n^2$ of $\theta$ in the global experiment of interest.
To our knowledge, Ispany, Pap and van Zuijlen (2003a) were the first to study
estimation in a nearly unstable INAR(1) model. These authors study the behavior of the
OLS estimator and they use a localizing rate $n$ instead of $n^2$. However, $n^2$ is the proper
localizing rate and, in Proposition 4.3, we show indeed that the OLS estimator is an
exploding estimator in $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$, that is, it has not even the 'right' rate of convergence.
The question then arises how we should estimate $h$. Instead of analyzing the asymptotic
behavior of a given estimator, we derive the asymptotic structure of the experiments
themselves by determining the limit experiment (in the Le Cam sense) of $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$.
This limit experiment gives bounds to the accuracy of inference procedures and suggests
how to construct efficient ones.

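The main contribution of this paper is to determine the limit experiment of $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$.
Remember that (see, e.g., Le Cam (1986), Le Cam and Yang (1990), Van der Vaart
(1991), Shiryaev and Spokoiny (1999) or Van der Vaart (2000) Chapter 9), the sequence
of experiments $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$ is said to converge to a limit experiment (in Le Cam's weak
topology) $\mathcal{E} = (\mathcal{X}, \mathcal{A}, (\mathbb{Q}_h \mid h \ge 0))$ if, for every finite subset $I \subset \mathbb{R}_+$ and every $h_0 \in \mathbb{R}_+$,
we have

\[ \left( \frac{d\mathbb{P}^{(n)}_{1-h/n^2}}{d\mathbb{P}^{(n)}_{1-h_0/n^2}} \right)_{h \in I} \xrightarrow{d} \left( \frac{d\mathbb{Q}_h}{d\mathbb{Q}_{h_0}} \right)_{h \in I}, \quad \text{under } \mathbb{P}^{(n)}_{1-h_0/n^2}. \]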

To see that it is indeed reasonable to expect $n^2$ as the proper localizing rate we briefly
discuss the case of geometrically distributed innovations (in the remainder we treat
general $G$). In case $G$ = Geometric(1/2), that is, $G$ puts mass $(1/2)^{k+1}$ at $k \in \mathbb{Z}_+$, it is an
easy exercise to verify for $h > 0$ (the geometric distribution allows us, using Newton's
binomial formula, to obtain explicit expressions for the transition probabilities from $X_{t-1}$
to $X_t$ if $X_t \ge X_{t-1}$),

\[ \frac{d\mathbb{P}^{(n)}_{1-h/r_n}}{d\mathbb{P}^{(n)}_1}(X_0, \ldots, X_n) \xrightarrow{p}
\begin{cases}
0, & \text{if } r_n/n^2 \to 0, \\
\exp\left( -\dfrac{h G(0) \mathbb{E}_G \varepsilon_1}{2} \right) = \exp\left( -\dfrac{h}{4} \right), & \text{if } r_n/n^2 \to 1, \\
1, & \text{if } r_n/n^2 \to \infty,
\end{cases}
\quad \text{under } \mathbb{P}_1. \]

This has two important implications: First, it indicates that $n^2$ is indeed the proper
localizing rate. Intuitively, if we go faster than $n^2$ we cannot distinguish $\mathbb{P}^{(n)}_{1-h/r_n}$ from
$\mathbb{P}^{(n)}_1$, and if we go slower we can distinguish $\mathbb{P}^{(n)}_{1-h/r_n}$ perfectly from $\mathbb{P}^{(n)}_1$. Second, since
$\exp(-h/4) < 1$ we have, by Le Cam's first lemma, no contiguity of $\mathbb{P}^{(n)}_{1-h/n^2}$ with respect
to $\mathbb{P}^{(n)}_1$. (Remark 2 after Theorem 2.1 gives an example of sets that yield this
non-contiguity.) This lack of contiguity is unfortunate for several reasons. Most important, if
we had contiguity the limiting behavior of $(d\mathbb{P}^{(n)}_{1-h/n^2}/d\mathbb{P}^{(n)}_1)_{h \in I}$ would determine the limit
experiment, whereas we now need to consider the behavior of $(d\mathbb{P}^{(n)}_{1-h/n^2}/d\mathbb{P}^{(n)}_{1-h_0/n^2})_{h \in I}$
for all $h_0 \ge 0$. And it implies that the global sequence of experiments does not have the
common LAQ structure (see Jeganathan (1995)) at $\theta = 1$. This differs from the traditional
AR(1) process $Y_0 = 0$, $Y_t = \mu + \theta Y_{t-1} + u_t$, $u_t$ i.i.d. $N(0, \sigma^2)$, with $\mu \neq 0$ and $\sigma^2$ known,
that enjoys this LAQ property at $\theta = 1$: the limit experiment at $\theta = 1$ is the usual
normal location experiment (i.e., the model is LAN) and the localizing rate is $n^{3/2}$. The
limit experiment at $\theta = 1$ for $Y_0 = 0$, $Y_t = \theta Y_{t-1} + u_t$, $u_t$ i.i.d. $N(0, \sigma^2)$, with $\sigma^2$ known,
does not have the LAN structure; the limit experiment is of the locally asymptotically
Brownian functional type (a special class of LAQ experiments; see Jeganathan (1995))
and the localizing rate is $n$. Thus although the INAR(1) process and the traditional AR(1)
process both are random walks with drift at $\theta = 1$, their statistical properties 'near $\theta = 1$' are very
different. In Section 3 we prove that the limit experiment of $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$ corresponds to
one draw from a Poisson distribution with mean $h G(0) \mathbb{E}_G \varepsilon_1 / 2$, that is,

\[ \mathcal{E}(G) = \left( \mathbb{Z}_+, 2^{\mathbb{Z}_+}, \left( \mathrm{Poisson}\left( \frac{h G(0) \mathbb{E}_G \varepsilon_1}{2} \right) \,\Big|\, h \ge 0 \right) \right). \]

We indeed recognize $\exp(-h G(0) \mathbb{E}_G \varepsilon_1 / 2)$ as the likelihood ratio at $h$ relative to $h_0 = 0$
in the experiment $\mathcal{E}(G)$. Due to the lack of enough smoothness of the likelihood
ratios around the unit root, this convergence of experiments is not obtained by the usual
(generally applicable) techniques, but rather by a direct approach. Since the transition
probability is the convolution of a binomial distribution with $G$ and the fact that certain
binomial experiments converge to a Poisson limit experiment, one might be tempted to
think that the convergence $\mathcal{E}_n(G) \to \mathcal{E}(G)$ follows, in some way, from this convergence.
As is clear from the proof of Theorem 3.1 this reasoning is not valid.
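To make the Poisson limit experiment concrete, the sketch below evaluates the likelihood ratio between two Poisson laws with means $h\,g(0)\mu_G/2$ and $h_0\,g(0)\mu_G/2$, which is the standard Poisson density ratio. The function name and argument names (`g0` for $G(0)$, `mu` for $\mathbb{E}_G\varepsilon_1$) are hypothetical and chosen only for this illustration.

```python
import math


def poisson_lr(z, h, h0, g0, mu):
    """Likelihood ratio dQ_h/dQ_h0 at z, where Q_h = Poisson(h * g0 * mu / 2).

    g0 stands for G(0) and mu for the innovation mean (illustrative names).
    """
    lam, lam0 = h * g0 * mu / 2.0, h0 * g0 * mu / 2.0
    if h0 == 0.0:
        # Q_0 is a point mass at 0: the ratio is exp(-lam) at z = 0,
        # and there is no Q_0-mass elsewhere.
        return math.exp(-lam) if z == 0 else 0.0
    # Ratio of two Poisson mass functions: exp(-(lam - lam0)) * (lam/lam0)**z,
    # and lam/lam0 = h/h0.
    return math.exp(-(lam - lam0)) * (h / h0) ** z
```

For Geometric(1/2) innovations one has `g0 = 0.5` and `mu = 1.0`, so `poisson_lr(0, h, 0.0, 0.5, 1.0)` equals `exp(-h/4)`, matching the constant that appears in the geometric example.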

The remainder of the paper is organized as follows: In Section 2 we discuss some
preliminary properties that provide insight into the behavior of a nearly unstable INAR(1)
process and are needed in the rest of the paper. The main result is stated and proved in
Section 3. Section 4 uses our main result to analyze some estimation and testing
problems. We consider efficient inference of $h$, the deviation from a unit root, in the nearly
unstable case for two settings. The first setting, discussed in Section 4.1, treats the case
that the immigration distribution $G$ is completely known. The second setting, analyzed
in Section 4.2, considers a semi-parametric model, where hardly any conditions on $G$
are imposed. Furthermore, we show in Section 4.1 that the OLS estimator is explosive
under the local alternative $\theta_n = 1 - h/n^2$. Finally, we discuss testing for a unit root in
Section 4.3. We show that the traditional Dickey-Fuller test has no (local) power, but
that an intuitively obvious test is efficient. Appendix A contains some auxiliary results
and Appendix B gathers some proofs.



2. Preliminaries 



This section discusses some basic properties of nearly unstable INAR(1) processes.
Besides giving insight into the behavior of a nearly unstable INAR(1) process, these
properties are a key input in the next sections. To enhance readability the proofs are organized
in Appendix B.

First, we introduce the following notation: The mean of $\varepsilon_t$ is denoted by $\mu_G$ and
its variance by $\sigma_G^2$. The probability mass function corresponding to $G$, the distribution
of the innovations $\varepsilon_t$, is denoted by $g$. The probability mass function of the binomial
distribution with parameters $\theta \in [0, 1]$ and $n \in \mathbb{Z}_+$ is denoted by $b_{n,\theta}$.

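Given $X_{t-1} = x_{t-1}$, the random variables $\varepsilon_t$ and $\vartheta \circ X_{t-1}$ are independent and
$X_{t-1} - \vartheta \circ X_{t-1}$, 'the number of deaths during $(t-1, t]$', follows a binomial$(x_{t-1}, 1 - \theta)$
distribution. This interpretation yields the following representation of the transition
probabilities,

\[ P^{\theta}_{x_{t-1},x_t} = \mathbb{P}_{\theta}\{X_t = x_t \mid X_{t-1} = x_{t-1}\}
= \sum_{k=0}^{x_{t-1}} \mathbb{P}_{\theta}\{\varepsilon_t = x_t - x_{t-1} + k,\ X_{t-1} - \vartheta \circ X_{t-1} = k \mid X_{t-1} = x_{t-1}\}
= \sum_{k=0}^{x_{t-1}} b_{x_{t-1},1-\theta}(k)\, g(\Delta x_t + k), \]

where $\Delta x_t = x_t - x_{t-1}$, and $g(i) = 0$ for $i < 0$. Under $\mathbb{P}_1$ we have $X_t = \mu_G t + \sum_{i=1}^{t} (\varepsilon_i - \mu_G)$,
and $P^{1}_{x_{t-1},x_t} = g(\Delta x_t)$, $x_{t-1}, x_t \in \mathbb{Z}_+$. Hence, under $\mathbb{P}_1$, an INAR(1) process is
nothing but an integer-valued random walk with drift.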
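The convolution representation of the transition probabilities translates directly into code. The sketch below is illustrative only: the helper names are hypothetical, and the Geometric(1/2) mass function is an assumed example of $g$.

```python
import math


def binom_pmf(n, p, k):
    """Binomial(n, p) probability mass at k (0 outside {0, ..., n})."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def transition_prob(x_prev, x_next, theta, g):
    """P^theta_{x_prev, x_next} = sum_k b_{x_prev, 1-theta}(k) * g(dx + k),
    where k counts the deaths and g is the innovation mass function."""
    dx = x_next - x_prev
    return sum(
        binom_pmf(x_prev, 1.0 - theta, k) * g(dx + k)
        for k in range(x_prev + 1)
    )


def geometric_half_pmf(i):
    """Assumed example: g(i) = (1/2)**(i + 1) for i >= 0, and 0 for i < 0."""
    return 0.5 ** (i + 1) if i >= 0 else 0.0
```

At `theta = 1.0` only the `k = 0` term survives, so `transition_prob(x_prev, x_next, 1.0, g)` reduces to `g(x_next - x_prev)`, the random walk transition stated above.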

The next proposition is basic, but often applied in the sequel. 

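Proposition 2.1. If $\sigma_G^2 < \infty$, we have for $h \ge 0$,

\[ \lim_{n \to \infty} \mathbb{E}_{1-h/n^2} \left[ \frac{1}{n^2} \sum_{t=1}^{n} X_t - \frac{\mu_G}{2} \right]^2 = 0. \tag{3} \]

If $\sigma_G^2 < \infty$, then we have for $\alpha > 0$ and every sequence $(\theta_n)_{n \in \mathbb{N}}$ in $[0, 1]$,

\[ \lim_{n \to \infty} \frac{1}{n^{3+\alpha}} \sum_{t=1}^{n} \mathbb{E}_{\theta_n} X_t^2 = 0. \tag{4} \]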

Remark 1. Convergence in probability for the case $h > 0$ in (3) cannot be concluded
from the convergence in probability in (3) for $h = 0$ by contiguity arguments. The reason
is (see Remark 2 after the proof of Theorem 2.1) that $\mathbb{P}^{(n)}_{1-h/n^2}$ is not contiguous with
respect to $\mathbb{P}^{(n)}_1$.

Next, we consider the thinning process $(\vartheta \circ X_{t-1})_{t \ge 1}$. Under $\mathbb{P}_{1-h/n^2}$, $X_{t-1} - \vartheta \circ X_{t-1}$,
conditional on $X_{t-1}$, follows a binomial$(X_{t-1}, h/n^2)$ distribution. So we expect that, near
unity, many 'deaths' do not occur in any time interval $(t-1, t]$. The following proposition
gives a precise statement where we use the following notation: For $h \ge 0$ and $n \in \mathbb{N}$,



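\[ A_n^h := \left\{ x \in \mathbb{Z}_+ \,\Big|\, \frac{h(x+1)}{n^2} < \frac{1}{2} \right\}, \qquad A_n := \{ (X_0, \ldots, X_{n-1}) \in A_n^h \times \cdots \times A_n^h \}. \tag{5} \]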



These sets are introduced for the following reasons: By Proposition A.1 we have for
$x \in A_n^h$, $\sum_{k=r}^{x} b_{x,h/n^2}(k) \le 2\, b_{x,h/n^2}(r)$ for $r = 2, 3$ and terms of the form (1 -
can be bounded neatly without having to make statements of the form 'for $n$ large
enough', or having to refer to 'up to a constant depending on $h$'. Also, recall the notation
$\Delta X_t = X_t - X_{t-1}$.

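Proposition 2.2. If $\sigma_G^2 < \infty$, then we have for all sequences $(\theta_n)_{n \in \mathbb{N}}$ in $[0, 1]$ and for
all $h \ge 0$,

\[ \lim_{n \to \infty} \mathbb{P}_{\theta_n}(A_n) = 1. \tag{6} \]

Moreover, if $\sigma_G^2 < \infty$ and $h \ge 0$, we have,

\[ \lim_{n \to \infty} \mathbb{P}_{1-h/n^2}\{ \exists t \in \{1, \ldots, n\} : X_{t-1} - \vartheta \circ X_{t-1} \ge 2 \} = 0. \tag{7} \]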

Finally, we derive the limit distribution of the number of downward movements of $X$
during $[0, n]$. The probability that the Bernoulli variable $1\{\Delta X_t < 0\}$ equals one is small.
Intuitively, the dependence over time of this indicator process is not too strong, so it is
not unreasonable to expect that a 'Poisson law of small numbers' holds. As the following
theorem shows, this is indeed the case.

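Theorem 2.1. If $\sigma_G^2 < \infty$, then we have for $h \ge 0$,

\[ \sum_{t=1}^{n} 1\{\Delta X_t < 0\} \xrightarrow{d} \mathrm{Poisson}\left( \frac{h g(0) \mu_G}{2} \right), \quad \text{under } \mathbb{P}_{1-h/n^2}. \tag{8} \]

Remark 2. Since $\sum_{t=1}^{n} 1\{\Delta X_t < 0\}$ equals zero under $\mathbb{P}^{(n)}_1$ and converges in distribution
to a non-degenerate limit under $\mathbb{P}^{(n)}_{1-h/n^2}$ ($h > 0$, $0 < g(0) < 1$), we see that $\mathbb{P}^{(n)}_{1-h/n^2}$ is
not contiguous with respect to $\mathbb{P}^{(n)}_1$ for $h > 0$.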
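The 'Poisson law of small numbers' for the downward movements can be explored by simulation. The following Monte Carlo sketch is only an illustration under stated assumptions: innovations are taken to be Geometric(1/2) (so $g(0) = 1/2$ and $\mu_G = 1$), the function names are hypothetical, and a finite-sample average is of course only approximately equal to the limiting mean $h g(0) \mu_G / 2$.

```python
import random


def count_downward_moves(h, n, rng):
    """Simulate X_0, ..., X_n with theta_n = 1 - h/n**2 and Geometric(1/2)
    innovations; return the number of t with X_t < X_{t-1}."""
    p_death = h / n**2  # each element dies with probability 1 - theta_n
    x, downs = 0, 0
    for _ in range(n):
        deaths = sum(1 for _ in range(x) if rng.random() < p_death)
        eps = 0
        while rng.random() < 0.5:  # Geometric(1/2) number of immigrants
            eps += 1
        new_x = x - deaths + eps
        if new_x < x:
            downs += 1
        x = new_x
    return downs


def mean_downward_moves(h, n, reps, seed=0):
    """Average number of downward moves over independent simulated paths."""
    rng = random.Random(seed)
    return sum(count_downward_moves(h, n, rng) for _ in range(reps)) / reps
```

For example, with `h = 4.0` the limiting Poisson mean under these assumptions is `4 * 0.5 * 1 / 2 = 1`, so `mean_downward_moves(4.0, 100, 200)` should land in the vicinity of 1, up to Monte Carlo and finite-sample error.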



3. The limit experiment: one observation from a 
Poisson distribution 

For easy reference, we introduce the following assumption. 

Assumption 3.1. A probability distribution $G$ on $\mathbb{Z}_+$ is said to satisfy Assumption 3.1
if one of the following two conditions holds:
(1) Support($G$) = $\{0, \ldots, M\}$ for some $M \in \mathbb{N}$;
(2) Support($G$) = $\mathbb{Z}_+$, $\sigma_G^2 < \infty$ and $g$ is eventually decreasing, that is, there exists
$M \in \mathbb{N}$ such that $g(k + 1) \le g(k)$ for $k \ge M$.

The rest of this section is devoted to the following theorem. 

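Theorem 3.1. Suppose $G$ satisfies Assumption 3.1. Then the limit experiment of
$(\mathcal{E}_n(G))_{n \in \mathbb{N}}$ is given by

\[ \mathcal{E}(G) = (\mathbb{Z}_+, 2^{\mathbb{Z}_+}, (\mathbb{Q}_h \mid h \ge 0)), \]

with $\mathbb{Q}_h = \mathrm{Poisson}(h g(0) \mu_G / 2)$. More precisely, for $h \ge 0$ and $h_0 > 0$ we have, under
$\mathbb{P}^{(n)}_{1-h_0/n^2}$,

\[ \frac{d\mathbb{P}^{(n)}_{1-h/n^2}}{d\mathbb{P}^{(n)}_{1-h_0/n^2}}(X_0, \ldots, X_n) \xrightarrow{d} \frac{d\mathbb{Q}_h}{d\mathbb{Q}_{h_0}}(Z) = \exp\left( -\frac{(h - h_0) g(0) \mu_G}{2} \right) \left( \frac{h}{h_0} \right)^{Z}, \tag{9} \]

while for $h \ge 0$ and $h_0 = 0$ we have, under $\mathbb{P}^{(n)}_1$,

\[ \frac{d\mathbb{P}^{(n)}_{1-h/n^2}}{d\mathbb{P}^{(n)}_1}(X_0, \ldots, X_n) \xrightarrow{d} \frac{d\mathbb{Q}_h}{d\mathbb{Q}_0}(Z) = \exp\left( -\frac{h g(0) \mu_G}{2} \right) 1\{Z = 0\}. \tag{10} \]

Proof. Introduce for $h, h_0 \ge 0$,

\[ \mathcal{L}_n(h, h_0) = \log \frac{d\mathbb{P}^{(n)}_{1-h/n^2}}{d\mathbb{P}^{(n)}_{1-h_0/n^2}} = \sum_{t=1}^{n} \log \frac{P^{1-h/n^2}_{X_{t-1},X_t}}{P^{1-h_0/n^2}_{X_{t-1},X_t}}. \]

Note, if $\sum_{t=1}^{n} 1\{\Delta X_t < 0\} > 0$ and $h_0 > 0$, that $\mathcal{L}_n(0, h_0) = -\infty$ and thus $d\mathbb{P}^{(n)}_1 / d\mathbb{P}^{(n)}_{1-h_0/n^2} = 0$.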

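Because $\mathcal{L}_n(h, h_0)$ is complicated to analyze, split the transition probability $P^{1-h/n^2}_{x_{t-1},x_t}$ into
a leading term,

\[ L_n(x_{t-1}, x_t, h) =
\begin{cases}
\sum_{k=-\Delta x_t}^{-\Delta x_t + 1} b_{x_{t-1},h/n^2}(k)\, g(\Delta x_t + k), & \text{if } \Delta x_t < 0, \\
\sum_{k=0}^{1} b_{x_{t-1},h/n^2}(k)\, g(\Delta x_t + k), & \text{if } \Delta x_t \ge 0,
\end{cases} \]

and a remainder term,

\[ R_n(x_{t-1}, x_t, h) =
\begin{cases}
\sum_{k=-\Delta x_t + 2}^{x_{t-1}} b_{x_{t-1},h/n^2}(k)\, g(\Delta x_t + k), & \text{if } \Delta x_t < 0, \\
\sum_{k=2}^{x_{t-1}} b_{x_{t-1},h/n^2}(k)\, g(\Delta x_t + k), & \text{if } \Delta x_t \ge 0.
\end{cases} \]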
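The difference between $\mathcal{L}_n(h, h_0)$ and its leading-term counterpart
$\tilde{\mathcal{L}}_n(h, h_0) = \sum_{t=1}^{n} \log \{ L_n(X_{t-1}, X_t, h) / L_n(X_{t-1}, X_t, h_0) \}$ is negligible.

Lemma 3.1. If $G$ satisfies Assumption 3.1, then we have for $h \ge 0$ and $h_0 \ge 0$,

\[ \mathcal{L}_n(h, h_0) = \tilde{\mathcal{L}}_n(h, h_0) + o(\mathbb{P}_{1-h_0/n^2}; 1). \tag{11} \]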

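To enhance readability the proof of the lemma is organized in Appendix B. Hence,
the limit distribution of the random vector $(\mathcal{L}_n(h, h_0))_{h \in I}$, for a finite subset $I \subset \mathbb{R}_+$,
is the same as the limit distribution of $(\tilde{\mathcal{L}}_n(h, h_0))_{h \in I}$. It easily follows, using (7), that
$\tilde{\mathcal{L}}_n(h, h_0)$ can be decomposed as

\[ \tilde{\mathcal{L}}_n(h, h_0) = \sum_{t=1}^{n} (X_{t-1} - 2) \log\left( \frac{1 - h/n^2}{1 - h_0/n^2} \right) + S_n^{+}(h, h_0) + S_n^{-}(h, h_0) + o(\mathbb{P}_{1-h_0/n^2}; 1), \tag{12} \]

where $S_n^{+}(h, h_0) = \sum_{t: \Delta X_t \ge 0} W_{tn}^{+}$ and $S_n^{-}(h, h_0) = \sum_{t: \Delta X_t = -1} W_{tn}^{-}$ are defined by (here
$\sum_{t: \Delta X_t = -1}$ is shorthand for $\sum_{1 \le t \le n: \Delta X_t = -1}$, and for $\sum_{t: \Delta X_t \ge 0}$ the same convention is
used),

\[ W_{tn}^{+} = \log\left[ \frac{g(\Delta X_t)(1 - h/n^2)^2 + X_{t-1}(h/n^2)(1 - h/n^2)\, g(\Delta X_t + 1)}{g(\Delta X_t)(1 - h_0/n^2)^2 + X_{t-1}(h_0/n^2)(1 - h_0/n^2)\, g(\Delta X_t + 1)} \right], \]

\[ W_{tn}^{-} = \log\left[ \frac{X_{t-1}(h/n^2)(1 - h/n^2)\, g(0) + (X_{t-1}(X_{t-1} - 1)/2)(h^2/n^4)\, g(1)}{X_{t-1}(h_0/n^2)(1 - h_0/n^2)\, g(0) + (X_{t-1}(X_{t-1} - 1)/2)(h_0^2/n^4)\, g(1)} \right]. \]

So we need to determine the asymptotic behavior of the terms in (12). By (3) we have,

\[ \sum_{t=1}^{n} (X_{t-1} - 2) \log\left[ \frac{1 - h/n^2}{1 - h_0/n^2} \right] \xrightarrow{p} -\frac{(h - h_0) \mu_G}{2}, \quad \text{under } \mathbb{P}_{1-h_0/n^2}. \tag{13} \]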

The next lemma yields the behavior of $S_n^{+}(h, h_0)$, the second term of (12); see Appendix B
for the proof.

Lemma 3.2. If $G$ satisfies Assumption 3.1, then we have for $h \ge 0$ and $h_0 \ge 0$,

\[ S_n^{+}(h, h_0) \xrightarrow{p} \frac{(h - h_0)(1 - g(0)) \mu_G}{2}, \quad \text{under } \mathbb{P}_{1-h_0/n^2}. \tag{14} \]



306 F.C. Drost, R. van den Akker and B.J.M. Werker 

Finally, we discuss the term $S_n^{-}(h, h_0)$ in (12). Under $\mathbb{P}_1$ this term is not present, so
we only need to consider $h_0 > 0$. We organize the result in the following lemma; see
Appendix B for the proof.

Lemma 3.3. If $G$ satisfies Assumption 3.1, then we have for $h_0 > 0$ and $h \ge 0$,

\[ S_n^{-}(h, h_0) = \log\left[ \frac{h}{h_0} \right] \sum_{t=1}^{n} 1\{\Delta X_t < 0\} + o(\mathbb{P}_{1-h_0/n^2}; 1), \tag{15} \]

where we set $\log(0) = -\infty$ and $\log(0) \cdot 0 = 0$.

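To complete the proof of the theorem, note that we obtain from Lemmas 3.1-3.3, (12)
and (13):

\[ \mathcal{L}_n(h, h_0) = \tilde{\mathcal{L}}_n(h, h_0) + o(\mathbb{P}_{1-h_0/n^2}; 1)
= -\frac{(h - h_0) g(0) \mu_G}{2} + \log\left[ \frac{h}{h_0} \right] \sum_{t=1}^{n} 1\{\Delta X_t < 0\} + o(\mathbb{P}_{1-h_0/n^2}; 1), \]

where we interpret $\log(0) = -\infty$, $\log(0) \cdot 0 = 0$ and $\log(h/0) \sum_{t=1}^{n} 1\{\Delta X_t < 0\} = 0$ when
$h_0 = 0$, $h > 0$. Hence, Theorem 2.1 implies that, for a finite subset $I \subset \mathbb{R}_+$,

\[ (\mathcal{L}_n(h, h_0))_{h \in I} \xrightarrow{d} \left( \log \frac{d\mathbb{Q}_h}{d\mathbb{Q}_{h_0}}(Z) \right)_{h \in I}, \quad \text{under } \mathbb{P}_{1-h_0/n^2}, \]

which concludes the proof. □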



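Remark 3. In the proof we have seen that,

\[ \log \frac{d\mathbb{P}^{(n)}_{1-h/n^2}}{d\mathbb{P}^{(n)}_{1-h_0/n^2}} = -\frac{(h - h_0) g(0) \mu_G}{2} + \log\left[ \frac{h}{h_0} \right] \sum_{t=1}^{n} 1\{\Delta X_t < 0\} + o(\mathbb{P}_{1-h_0/n^2}; 1). \]

So, heuristically, we can interpret $\sum_{t=1}^{n} 1\{\Delta X_t < 0\}$ as an 'approximately sufficient
statistic'.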



4. Applications 

This section considers the following applications as an illustration of the statistical
consequences of the convergence of experiments. We discuss the efficient estimation of $h$, the
deviation from a unit root, in the nearly unstable case for two settings. The first setting,
discussed in Section 4.1, treats the case that $G$ is completely known. And the second
setting, analyzed in Section 4.2, considers a semi-parametric model, where hardly any
conditions on $G$ are imposed. Finally, we discuss testing for a unit root in Section 4.3.



Nearly unstable INAR 



307 



4.1. Efficient estimation of h in nearly unstable INAR models
(G known)

In this section $G$ is assumed to be known. So we consider the sequence of experiments
$(\mathcal{E}_n(G))_{n \in \mathbb{N}}$. As before, we denote the observation from the limit experiment $\mathcal{E}(G)$ by $Z$,
and $\mathbb{Q}_h = \mathrm{Poisson}(h g(0) \mu_G / 2)$.

Since we have established convergence of $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$ to $\mathcal{E}(G)$, an application of the Le
Cam-Van der Vaart asymptotic representation theorem yields the following proposition.

Proposition 4.1. Suppose $G$ satisfies Assumption 3.1. If $(T_n)_{n \in \mathbb{N}}$ is a sequence of
estimators of $h$ in the sequence of experiments $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$ such that $\mathcal{L}(T_n \mid \mathbb{P}_{1-h/n^2}) \to Z_h$
for all $h \ge 0$, then there exists a map $t: \mathbb{Z}_+ \times [0, 1] \to \mathbb{R}$ such that $Z_h = \mathcal{L}(t(Z, U) \mid \mathbb{Q}_h \times
\mathrm{Uniform}[0, 1])$ (i.e., $U$ is distributed uniformly on $[0, 1]$ and independent of the
observation $Z$ from the limit experiment $\mathcal{E}(G)$).

Proof. The sequence $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$ converges to the experiment $\mathcal{E}(G)$ (Theorem 3.1). Since
$\mathcal{E}(G)$ is dominated by counting measure on $\mathbb{Z}_+$, the result follows by applying the Le
Cam-Van der Vaart asymptotic representation theorem (see, e.g., Van der Vaart (1991),
Theorem 3.1, or Van der Vaart (2000), Theorem 9.3). □

Thus, for any set of limit laws of an estimator there is a randomized estimator in the
limit experiment that has the same set of laws. If the asymptotic performance of an
estimator is considered to be determined by its sets of limit laws, the limit experiment thus
gives a lower bound to what is possible: Along the sequence of experiments you cannot
do better than the best procedure in the limit experiment. To discuss efficient estimation
we need to prescribe what we judge to be optimal in the Poisson limit experiment. Often
a normal location experiment is the limit experiment. For such a normal location
experiment, that is, estimate $h$ on the basis of one observation $Y$ from $N(h, \tau)$ ($\tau$ known), it
is natural to restrict to location-equivariant estimators. For this class one has a
convolution property (see, e.g., Bickel et al. (1998), Van der Vaart (2000) or Wong (1992)): the
law of every location-equivariant estimator $T$ of $h$ can be decomposed as $T = Y + V$,
where $V$ is independent of $Y$. This yields, by Anderson's lemma (see, e.g., Lemma 8.5 in
Van der Vaart (2000)), efficiency of $Y$ (within the class of location-equivariant
estimators) for all bowl-shaped loss functions. To be more general, there are convolution results
for shift experiments. However, the Poisson limit experiment $\mathcal{E}(G)$ does not have a
natural shift structure. In such a Poisson setting it seems reasonable to minimize variance
amongst the unbiased estimators. See Ling and McAleer (2003) for a similar approach
for LAQ limit experiments.

Definition. An estimator $\hat{h}$ for $h$ is called efficient in the experiment $\mathcal{E}(G)$ if it is
unbiased, that is, $\mathbb{E}_h \hat{h} = h$ for all $h \ge 0$, and minimizes the variance amongst all unbiased
(randomized) estimators of $h$.



The next proposition is an immediate consequence of the Lehmann-Scheffe theorem. 
Proposition 4.2. If $0 < g(0) < 1$ and $\mu_G < \infty$, then $2Z / (g(0) \mu_G)$ is an efficient estimator
of $h$ in the experiment $\mathcal{E}(G)$. The variance of this estimator, under $\mathbb{Q}_h$, is given by
$2h / (g(0) \mu_G)$.
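The unbiasedness and variance claims can be checked directly from the first two moments of a Poisson variable; the short computation below is a worked verification, using only that $Z$ has mean and variance $h g(0) \mu_G / 2$ under $\mathbb{Q}_h$.

```latex
\mathbb{E}_h\!\left[ \frac{2Z}{g(0)\mu_G} \right]
  = \frac{2}{g(0)\mu_G} \cdot \frac{h\, g(0)\mu_G}{2} = h,
\qquad
\operatorname{Var}_h\!\left( \frac{2Z}{g(0)\mu_G} \right)
  = \frac{4}{(g(0)\mu_G)^2} \cdot \frac{h\, g(0)\mu_G}{2}
  = \frac{2h}{g(0)\mu_G}.
```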

A combination with Proposition 4.1 yields a variance lower bound to asymptotically
unbiased estimators in the sequence of experiments $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$.

Corollary 4.1. Suppose $G$ satisfies Assumption 3.1. If $(T_n)_{n \in \mathbb{N}}$ is an estimator of $h$ in
the sequence of experiments $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$ such that $\mathcal{L}(T_n \mid \mathbb{P}_{1-h/n^2}) \to Z_h$ with $\int z \, dZ_h(z) =
h$ for all $h \ge 0$, then we have

\[ \int (z - h)^2 \, dZ_h(z) \ge \frac{2h}{g(0) \mu_G} \quad \text{for all } h \ge 0. \tag{16} \]

It is not unnatural to restrict to estimators that satisfy $\mathcal{L}(T_n \mid \mathbb{P}_{1-h/n^2}) \to Z_h$. We make
the additional restriction that $\int z \, dZ_h(z) = h$, that is, the limit distribution is unbiased.
Now, based on the previous proposition, it is natural to call an estimator in this class
efficient if it attains the variance bound (16). To demonstrate the efficiency of a given
estimator, one only needs to show that it belongs to the class of asymptotically unbiased
estimators, and that it attains the bound. How should we estimate $h$? Recall that we
interpreted $\sum_{t=1}^{n} 1\{\Delta X_t < 0\}$ as an approximately sufficient statistic for $h$. Hence, it is
natural to try to construct an efficient estimator based on this statistic. Using
Theorem 2.1 we see that this is indeed possible.



Corollary 4.2. If Assumption 3.1 holds, then the estimator,

\[ \hat{h}_n = \frac{2 \sum_{t=1}^{n} 1\{\Delta X_t < 0\}}{g(0) \mu_G} \tag{17} \]

is an efficient estimator of $h$ in the sequence $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$.
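As an illustration, the estimator of Corollary 4.2 only requires counting the downward movements of the observed path. The sketch below assumes the path is given as a Python list `[X_0, ..., X_n]` and that $g(0)$ and $\mu_G$ are known; the function and argument names are hypothetical.

```python
def downward_moves(path):
    """Number of t in {1, ..., n} with X_t < X_{t-1}."""
    return sum(1 for prev, cur in zip(path, path[1:]) if cur < prev)


def efficient_h_estimate(path, g0, mu):
    """Estimate of h for known G: 2 * #{downward moves} / (g(0) * mu_G).

    g0 stands for g(0) and mu for mu_G; both are assumed known and positive.
    """
    if g0 <= 0.0 or mu <= 0.0:
        raise ValueError("g(0) and mu_G must be positive")
    return 2.0 * downward_moves(path) / (g0 * mu)
```

For instance, the path `[0, 1, 3, 2, 2, 4, 3]` contains two downward moves, so with `g0 = 0.5` and `mu = 1.0` the estimate is `2 * 2 / 0.5 = 8.0`.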



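Finally, we discuss the commonly used OLS estimator when $\theta_n = 1 - h/n^2$. Rewriting
$X_t = \vartheta \circ X_{t-1} + \varepsilon_t = \mu_G + \theta_n X_{t-1} + u_t$ for $u_t = \varepsilon_t - \mu_G + \vartheta \circ X_{t-1} - \theta_n X_{t-1}$, we obtain the
autoregression $X_t - \mu_G = \theta_n X_{t-1} + u_t$, which can also be written as $n^2 (X_t - X_{t-1} - \mu_G) =
h(-X_{t-1}) + n^2 u_t$ (note that indeed $\mathbb{E}_{\theta_n} u_t = \mathbb{E}_{\theta_n} X_{t-1} u_t = 0$). So the OLS estimator of $\theta_n$
is given by

\[ \hat{\theta}_n^{OLS} = \frac{\sum_{t=1}^{n} X_{t-1} (X_t - \mu_G)}{\sum_{t=1}^{n} X_{t-1}^2}, \tag{18} \]

and the OLS estimator of $h$ is given by

\[ \hat{h}_n^{OLS} = n^2 (1 - \hat{\theta}_n^{OLS}) = -\frac{n^2 \sum_{t=1}^{n} X_{t-1} (X_t - X_{t-1} - \mu_G)}{\sum_{t=1}^{n} X_{t-1}^2}. \]

Ispany, Pap and van Zuijlen (2003a) showed that $n^{3/2} (\hat{\theta}_n^{OLS} - \gamma_n) \xrightarrow{d} N(0, \sigma_\gamma^2)$ under $\mathbb{P}_{\gamma_n}$
for $\gamma_n = 1 - h_n/n$, $h_n \to \gamma$, and $\sigma_\gamma^2$ depending on $\gamma$. This means that the OLS estimator
can be used to distinguish alternatives at rate $n^{3/2}$. Since the convergence of experiments
takes place at rate $n^2$, the OLS estimator deteriorates under the localizing rate $n^2$.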

Proposition 4.3. IfKcSi < 00, then we have for all h > 0, 

\[ |\hat{h}_n^{OLS}| \xrightarrow{p} \infty, \quad \text{under } \mathbb{P}_{1-h/n^2}. \]

Remark 4. A similar result holds for the OLS estimator in the model where $G$ is
unknown.

Thus the OLS estimator cannot distinguish local alternatives at rate $n^2$; at lower rates
(up to $n^{3/2}$) it is capable of distinguishing alternatives. In this sense it does not have the
right rate of convergence.
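For comparison with the count-based estimator, the OLS quantities can be computed from the no-intercept regression of $X_t - \mu_G$ on $X_{t-1}$ described above, with $\hat{h}^{OLS} = n^2 (1 - \hat{\theta}^{OLS})$. This is a minimal sketch with hypothetical function names, assuming $\mu_G$ is known and the path is a list `[X_0, ..., X_n]`.

```python
def ols_theta(path, mu):
    """No-intercept least squares slope in X_t - mu = theta * X_{t-1} + u_t."""
    num = sum(prev * (cur - mu) for prev, cur in zip(path, path[1:]))
    den = sum(prev**2 for prev in path[:-1])
    if den == 0:
        raise ValueError("regressor X_{t-1} is identically zero")
    return num / den


def ols_h(path, mu):
    """OLS-based estimate of the local parameter: n**2 * (1 - theta_hat)."""
    n = len(path) - 1
    return n**2 * (1.0 - ols_theta(path, mu))
```

Because of the factor `n**2`, small sampling fluctuations of the slope estimate are blown up, which is the mechanism behind the explosive behavior stated in Proposition 4.3.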

4.2. Efficient estimation of h in nearly unstable INAR models
(G unknown)

So far we have assumed that $G$ is known. In this section, where we instead consider a
semi-parametric model, we hardly impose conditions on $G$ (see, e.g., Bickel and Kwon
(2001) or Wefelmeyer (1996) for general theories on semi-parametric stationary Markov
models, and Drost, Klaassen and Werker (1997) for group-based time series models).
The dependence of $\mathbb{P}_{\theta}$ upon $G$ is made explicit by adding a subscript: $\mathbb{P}_{\theta,G}$. Formally, we
consider the sequence of experiments,

\[ \mathcal{E}_n = \left( \mathbb{Z}_+^{n+1}, 2^{\mathbb{Z}_+^{n+1}}, \left( \mathbb{P}^{(n)}_{1-h/n^2,G} \mid (h, G) \in [0, \infty) \times \mathcal{G} \right) \right), \quad n \in \mathbb{N}, \]

where $\mathcal{G}$ is the set of all distributions on $\mathbb{Z}_+$ that satisfy Assumption 3.1.

The goal is to estimate h efficiently. Here efficient, just as in the previous section, means 
asymptotically unbiased with minimal variance. Since the semi-parametric model is more 
realistic, the estimation of h becomes more difficult. As we will see, the situation for our 
semi-parametric model is quite fortunate: we can estimate h with the same asymptotic 
precision as in the case where G is known. In semi-parametric statistics this is called 
adaptive estimation. 

The efficient estimator for the case where $G$ is known cannot be used anymore, since
it depends on $g(0)$ and $\mu_G$. The obvious idea is to replace these objects by estimates.
The next proposition provides consistent estimators.

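Proposition 4.4. Let $h \ge 0$ and $G$ satisfy $\sigma_G^2 < \infty$. Then we have

\[ \hat{g}_n(0) = \frac{1}{n} \sum_{t=1}^{n} 1\{X_t = X_{t-1}\} \xrightarrow{p} g(0) \quad \text{and} \quad \hat{\mu}_{G,n} = \frac{X_n}{n} \xrightarrow{p} \mu_G, \quad \text{under } \mathbb{P}_{1-h/n^2,G}. \]

Proof. Notice first that we have

\[ \frac{1}{n} \sum_{t=1}^{n} (X_{t-1} - \vartheta \circ X_{t-1}) \xrightarrow{p} 0, \quad \text{under } \mathbb{P}_{1-h/n^2,G}, \tag{19} \]

since, conditioning on $X_{t-1}$ and using (3),

\[ 0 \le \frac{1}{n} \sum_{t=1}^{n} \mathbb{E}_{1-h/n^2,G} (X_{t-1} - \vartheta \circ X_{t-1}) = \frac{h}{n^3} \sum_{t=1}^{n} \mathbb{E}_{1-h/n^2,G} X_{t-1} \to 0. \]

Using that $|1\{X_t = X_{t-1}\} - 1\{\varepsilon_t = 0\}| = 1$ only if $X_{t-1} - \vartheta \circ X_{t-1} \ge 1$, we easily obtain
by using (19),

\[ \left| \hat{g}_n(0) - \frac{1}{n} \sum_{t=1}^{n} 1\{\varepsilon_t = 0\} \right| \le \frac{1}{n} \sum_{t=1}^{n} 1\{X_{t-1} - \vartheta \circ X_{t-1} \ge 1\} \le \frac{1}{n} \sum_{t=1}^{n} (X_{t-1} - \vartheta \circ X_{t-1}) \xrightarrow{p} 0. \]

Now the result for $\hat{g}_n(0)$ follows by applying the weak law of large numbers to
$n^{-1} \sum_{t=1}^{n} 1\{\varepsilon_t = 0\}$. Next, consider $\hat{\mu}_{G,n}$. We have, using (19) and the weak law of large
numbers for $n^{-1} \sum_{t=1}^{n} \varepsilon_t$,

\[ \hat{\mu}_{G,n} = \frac{X_n}{n} = \frac{1}{n} \sum_{t=1}^{n} \varepsilon_t - \frac{1}{n} \sum_{t=1}^{n} (X_{t-1} - \vartheta \circ X_{t-1}) \xrightarrow{p} \mu_G, \quad \text{under } \mathbb{P}_{1-h/n^2,G}, \]

which concludes the proof. □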
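From the previous proposition we have $\tilde{h}_n - \hat{h}_n \xrightarrow{p} 0$, under $\mathbb{P}_{1-h/n^2,G}$, where

\[ \tilde{h}_n = \frac{2 \sum_{t=1}^{n} 1\{\Delta X_t < 0\}}{\hat{g}_n(0)\, \hat{\mu}_{G,n}}. \]

This implies that estimation of $h$ in the semi-parametric experiments $(\mathcal{E}_n)_{n \in \mathbb{N}}$ is not
harder than the estimation of $h$ in $(\mathcal{E}_n(G))_{n \in \mathbb{N}}$. In semi-parametric parlance: The
semi-parametric problem is adaptive to $G$. The precise statement is given in the following
corollary; the proof is trivial.

Corollary 4.3. If $(T_n)_{n \in \mathbb{N}}$ is a sequence of estimators in the semi-parametric sequence
of experiments $(\mathcal{E}_n)_{n \in \mathbb{N}}$ such that $\mathcal{L}(T_n \mid \mathbb{P}_{1-h/n^2,G}) \to Z_{h,G}$ with $\int z \, dZ_{h,G}(z) = h$ for all
$(h, G) \in [0, \infty) \times \mathcal{G}$, then we have

\[ \int (z - h)^2 \, dZ_{h,G}(z) \ge \frac{2h}{g(0) \mu_G} \quad \text{for all } (h, G) \in [0, \infty) \times \mathcal{G}. \]

The estimator $\tilde{h}_n$ satisfies the conditions and achieves the variance bound.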
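The plug-in construction can be sketched as follows: $g(0)$ is replaced by the fraction of time points at which the path does not move, $\mu_G$ by $X_n / n$, and both are inserted into the count-based estimator. The function name is hypothetical and the path is assumed to be a list `[X_0, ..., X_n]` with `X_0 = 0`.

```python
def plug_in_h_estimate(path):
    """Semi-parametric estimate of h: 2 * #{downward moves} / (g0_hat * mu_hat),
    with g0_hat = (1/n) * #{t : X_t = X_{t-1}} and mu_hat = X_n / n."""
    n = len(path) - 1
    steps = list(zip(path, path[1:]))
    g0_hat = sum(1 for prev, cur in steps if cur == prev) / n
    mu_hat = path[-1] / n
    downs = sum(1 for prev, cur in steps if cur < prev)
    if g0_hat == 0.0 or mu_hat == 0.0:
        raise ValueError("plug-in estimates of g(0) and mu_G must be positive")
    return 2.0 * downs / (g0_hat * mu_hat)
```

Unlike the known-$G$ version, this function needs no input besides the observed path, which is what makes the adaptivity result practically convenient.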
4.3. Testing for a unit root

This section discusses testing for a unit root in an INAR(l) model. We consider the 
case where G is known and satisfies Assumption 3.1. We want to test the hypothesis 
Ho : = 1 versus Hi : < 1. Hehstrom (2001) considered this problem from the perspective 
that one wants to use standard (i.e., OLS) software routines. He derives, by Monte Carlo 
simulations, the finite sample null distributions for a Dickey-Fuller test of a random walk 
with Poisson distributed errors. This (standard) Dickey-Fuller test statistic is given by 
the usual (i.e., non-corrected) t-test that the slope parameter equals 1, that is. 



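\[
\tau_n = \frac{\hat\theta_n^{OLS} - 1}{\operatorname{se}(\hat\theta_n^{OLS})},
\]

where $\hat\theta_n^{OLS}$ is given by (18) and $\operatorname{se}(\hat\theta_n^{OLS})$ denotes its usual OLS standard error. Under
$H_0$, that is, under $P_1$, we have $\tau_n \xrightarrow{d} N(0,1)$. To
analyze the power of this test, and since $\mathcal{E}_n(G) \to \mathcal{E}(G)$, we consider the performance
of $\tau_n$ along the sequence $\mathcal{E}_n(G)$. The following proposition shows, however, that the
asymptotic probability that the null hypothesis is rejected remains $\alpha$ for all alternatives.
Obviously, this does not exclude power of the Dickey-Fuller test under local alternatives
at rate $n^{-3/2}$ (which indeed it has).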

Proposition 4.5. If $E_G\varepsilon_1^3 < \infty$, we have for all $h > 0$,

\[
\tau_n \xrightarrow{d} N(0,1), \quad\text{under } P_{1-h/n^2},
\quad\text{which yields}\quad
\lim_{n\to\infty} P_{1-h/n^2}(\text{reject } H_0) = \alpha.
\]



Proof. From Ispány, Pap and van Zuijlen (2003a) the result easily follows. □

We propose the intuitively obvious tests

\[
\psi_n(X_0,\dots,X_n) =
\begin{cases}
\alpha, & \text{if } \sum_{t=1}^n 1\{\Delta X_t < 0\} = 0,\\
1, & \text{if } \sum_{t=1}^n 1\{\Delta X_t < 0\} \ge 1,
\end{cases}
\]

that is, reject $H_0$ if the process ever moves down and reject $H_0$ with probability
$\alpha$ if there are no downward movements. We will see that this obvious test is, in
fact, efficient. To discuss the efficiency of tests, we recall the implication of the Le
Cam-Van der Vaart asymptotic representation theorem to testing (see Theorem 7.2 in
Van der Vaart (1991)). Let $\alpha \in (0,1)$ and let $\phi_n$ be a sequence of tests in $(\mathcal{E}_n(G))_{n\in\mathbb{N}}$ such
that $\limsup_{n\to\infty} E_1\phi_n(X_0,\dots,X_n) \le \alpha$. Then we have

\[
\limsup_{n\to\infty} E_{1-h/n^2}\phi_n(X_0,\dots,X_n) \le \sup_{\phi\in\Phi_\alpha} E_h\phi(Z)
\quad\text{for all } h > 0,
\]

where $\Phi_\alpha$ is the collection of all level $\alpha$ tests for testing $H_0: h = 0$ versus $H_1: h > 0$ in the
Poisson limit experiment $\mathcal{E}(G)$. If we have equality in the previous display, it is natural
to call a test $\phi_n$ efficient. It is obvious that the uniformly most powerful test in the Poisson
limit experiment is given by

\[
\phi(Z) =
\begin{cases}
\alpha, & \text{if } Z = 0,\\
1, & \text{if } Z \ge 1.
\end{cases}
\]



Its power function is given by $E_0\phi(Z) = \alpha$ and $E_h\phi(Z) = 1 - (1-\alpha)\exp(-hg(0)\mu_G/2)$.
Using Theorem 2.1 we find

\[
\lim_{n\to\infty} E_1\psi_n(X_0,\dots,X_n) = \alpha
\]

and

\[
\lim_{n\to\infty} E_{1-h/n^2}\psi_n(X_0,\dots,X_n) = 1 - (1-\alpha)\exp\Bigl(-\frac{hg(0)\mu_G}{2}\Bigr)
\quad\text{for } h > 0.
\]

We conclude that the test $\psi_n$ is indeed efficient.
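The test and its limiting power function are simple enough to write down directly. The following sketch is illustrative only: the function names `limit_power` and `psi_n` are hypothetical, and the numerical values assume Poisson(1) innovations, for which $g(0) = e^{-1}$ and $\mu_G = 1$.

```python
import math

def limit_power(h, alpha, g0, mu):
    """Limiting power 1 - (1 - alpha) * exp(-h * g(0) * mu_G / 2) at local alternative h."""
    return 1.0 - (1.0 - alpha) * math.exp(-h * g0 * mu / 2.0)

def psi_n(path, alpha):
    """Rejection probability of the proposed test for an observed path X_0, ..., X_n."""
    moved_down = any(later < earlier for earlier, later in zip(path, path[1:]))
    return 1.0 if moved_down else alpha

alpha, g0, mu = 0.05, math.exp(-1.0), 1.0   # assumed: Poisson(1) innovations
for h in (0.0, 1.0, 5.0, 20.0):
    print(h, limit_power(h, alpha, g0, mu))
```

At $h = 0$ the function returns the level $\alpha$, and the power increases to 1 as $h$ grows, at a speed governed by $g(0)\mu_G/2$.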

Appendix A: Auxiliaries 

The following result is basic (see, e.g., Feller (1968), pages 150-151), but since it is heavily
applied, we state it here for convenience.

Proposition A.1. Let $m \in \mathbb{N}$, $p \in (0,1)$. If $r > mp$, we have

\[
\sum_{k=r}^m b_{m,p}(k) \le b_{m,p}(r)\,\frac{r(1-p)}{r-mp}. \tag{20}
\]

So, if $1 > mp$, we have for $r = 2, 3$,

\[
\sum_{k=r}^m b_{m,p}(k) \le 2\,b_{m,p}(r). \tag{21}
\]
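As a numerical sanity check of the binomial tail bounds, the sketch below evaluates both sides exactly for one illustrative choice of $m$ and $p$ with $mp < 1$. It assumes that the right-hand side of (20) is Feller's bound $b_{m,p}(r)\,r(1-p)/(r-mp)$; the helper names are hypothetical.

```python
from math import comb

def binom_pmf(m, p, k):
    """b_{m,p}(k): probability of k successes in m Bernoulli(p) trials."""
    return comb(m, k) * p**k * (1.0 - p) ** (m - k)

def upper_tail(m, p, r):
    """Exact tail probability sum_{k=r}^m b_{m,p}(k)."""
    return sum(binom_pmf(m, p, k) for k in range(r, m + 1))

def feller_bound(m, p, r):
    """Assumed right-hand side of (20); only meaningful for r > m * p."""
    return binom_pmf(m, p, r) * r * (1.0 - p) / (r - m * p)

m, p = 50, 0.01                      # m * p = 0.5 < 1, so (21) applies
for r in (2, 3):
    tail = upper_tail(m, p, r)
    print(r, tail, feller_bound(m, p, r), 2.0 * binom_pmf(m, p, r))
```

For each $r$ the three printed numbers should be increasing: the exact tail, the bound (20), and the cruder bound (21).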

For convenience we recall Theorem 1 in Serfling (1975). 

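Lemma A.1. Let $Z_1,\dots,Z_n$ be (possibly dependent) 0-1 valued random variables and set
$S_n = \sum_{t=1}^n Z_t$. Let $Y$ be Poisson distributed with mean $\sum_{t=1}^n EZ_t$. Then we have

\[
\sup_{A\subset\mathbb{Z}_+} |P\{S_n \in A\} - P\{Y \in A\}|
\le \sum_{t=1}^n (EZ_t)^2 + \sum_{t=1}^n E\bigl|E[Z_t \mid Z_1,\dots,Z_{t-1}] - EZ_t\bigr|.
\]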
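For independent indicators the conditional expectation term vanishes, and the bound reduces to $\sum_t (EZ_t)^2$. The sketch below checks this special case exactly; it is only an illustration under that independence assumption, with hypothetical helper names and arbitrarily chosen success probabilities.

```python
import math

def bernoulli_sum_pmf(probs):
    """Exact pmf of a sum of independent Bernoulli(p_t) variables, by convolution."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam**k / math.factorial(k)

def total_variation(probs):
    """sup_A |P(S_n in A) - P(Y in A)|, i.e. half the L1 distance between the pmfs."""
    lam, pmf = sum(probs), bernoulli_sum_pmf(probs)
    l1 = sum(abs(mass - poisson_pmf(lam, k)) for k, mass in enumerate(pmf))
    l1 += 1.0 - sum(poisson_pmf(lam, k) for k in range(len(pmf)))  # Poisson mass above n
    return 0.5 * l1

probs = [0.02 * (t % 5 + 1) for t in range(40)]   # illustrative values only
print(total_variation(probs), sum(p * p for p in probs))
```

The first printed number (the exact distance) should not exceed the second (the bound).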
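Proof of Proposition 2.1. Obviously $\operatorname{Var}_1(\sum_{t=1}^n X_t) = O(n^3)$ and
$\lim_{n\to\infty} n^{-2}\sum_{t=1}^n E_1X_t = \mu_G/2$, which yields (3) for $h = 0$. Next, we prove (3) for $h > 0$.
Straightforward calculations show, for $\theta < 1$,

\[
\sum_{t=1}^n E_\theta X_t = \mu_G\sum_{t=1}^n \frac{1-\theta^t}{1-\theta}
= \mu_G\,\frac{(n+1)(1-\theta) - (1-\theta^{n+1})}{(1-\theta)^2},
\]

which yields

\[
\lim_{n\to\infty}\frac{1}{n^2}E_{1-h/n^2}\sum_{t=1}^n X_t
= \lim_{n\to\infty}\frac{\mu_G}{n^2}\,
\frac{(n+1)h/n^2 - 1 + \bigl[1 - (n+1)h/n^2 + ((n+1)n/2)h^2/n^4 + o(1/n^2)\bigr]}{(h/n^2)^2}
= \frac{\mu_G}{2}. \tag{1}
\]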

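To treat the variance of $n^{-2}\sum_{t=1}^n X_t$, we use the following simple relations; see also
Ispány, Pap and van Zuijlen (2003a), for $0 < \theta < 1$, $s,t \ge 1$,

\[
\operatorname{Var}_\theta X_t \le (\sigma_G^2+\mu_G)\,\frac{1-\theta^{2t}}{1-\theta^2},
\qquad
\operatorname{Cov}_\theta(X_s,X_t) = \theta^{|t-s|}\operatorname{Var}_\theta X_{t\wedge s}. \tag{2}
\]

From this we obtain, as $n\to\infty$,

\[
\operatorname{Var}_{1-h/n^2}\Bigl(\frac{1}{n^2}\sum_{t=1}^n X_t\Bigr)
\le \frac{1}{n^4}\,2n\sum_{t=1}^n \operatorname{Var}_{1-h/n^2}X_t
\le \frac{1}{n^4}\,2n(\sigma_G^2+\mu_G)\,n\,\frac{1-(1-h/n^2)^{2n}}{1-(1-h/n^2)^2} \to 0.
\]

Together with (1) this completes the proof of (3) for $h > 0$. To prove (4), note that
$X_t \le \sum_{i=1}^t \varepsilon_i$. Hence $E_\theta X_t^2 \le E_1X_t^2 = \sigma_G^2 t + \mu_G^2 t^2$, which yields the desired conclusion. □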
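The limit $n^{-2}\sum_{t=1}^n E X_t \to \mu_G/2$ along $\theta_n = 1 - h/n^2$ can be checked numerically from the standard mean formula $E_\theta X_t = \mu_G(1-\theta^t)/(1-\theta)$ for an INAR(1) process started at $X_0 = 0$. The sketch below is illustrative; the function names and the values of $h$ and $\mu_G$ are arbitrary choices.

```python
def mean_x(t, theta, mu):
    """E_theta X_t = mu * (1 - theta^t) / (1 - theta) for theta < 1 and X_0 = 0."""
    return mu * (1.0 - theta**t) / (1.0 - theta)

def scaled_mean_sum(n, h, mu):
    """n^{-2} * sum_{t=1}^n E X_t under theta = 1 - h/n^2 (theta = 1 when h = 0)."""
    if h == 0:
        return mu * n * (n + 1) / 2.0 / n**2      # E_1 X_t = mu * t
    theta = 1.0 - h / n**2
    return sum(mean_x(t, theta, mu) for t in range(1, n + 1)) / n**2

for n in (10, 100, 1000, 10000):
    print(n, scaled_mean_sum(n, h=3.0, mu=2.0))   # should approach mu / 2 = 1.0
```

The printed values should settle near $\mu_G/2$ regardless of the chosen $h$, which is the content of (3) at the level of expectations.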

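Proof of Proposition 2.2. Equation (6) easily follows since, for a sequence $(\theta_n)_{n\in\mathbb{N}}$ in
$[0,1]$, (4) implies

\[
P_{\theta_n}\bigl\{\exists t\in\{1,\dots,n\}: X_t > n^{7/4}\bigr\}
\le \frac{1}{n^{7/2}}\sum_{t=1}^n E_{\theta_n}X_t^2 \to 0, \quad\text{as } n\to\infty. \tag{3}
\]

To obtain (7) note that, for $X_{t-1} \in \mathcal{A}_n^h$ we have, using the bound (21),

\[
P_{1-h/n^2}\{X_{t-1} - \vartheta\circ X_{t-1} \ge 2 \mid X_{t-1}\}
\le 2\,b_{X_{t-1},h/n^2}(2) \le \frac{h^2}{n^4}X_{t-1}^2.
\]

By (4) this yields

\[
\lim_{n\to\infty} P_{1-h/n^2}\bigl(\{\exists t\in\{1,\dots,n\}: X_{t-1} - \vartheta\circ X_{t-1} \ge 2\}\cap A_n^h\bigr)
\le \lim_{n\to\infty}\frac{h^2}{n^4}\sum_{t=1}^n E_{1-h/n^2}X_{t-1}^2 = 0.
\]

Since we already showed $\lim_{n\to\infty} P_{1-h/n^2}(A_n^h) = 1$, this yields (7). □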

Proof of Theorem 2.1. If $g(0) = 0$, then $\Delta X_t < 0$ implies $X_{t-1} - \vartheta\circ X_{t-1} \ge 2$. Hence,
(7) implies $\sum_{t=1}^n 1\{\Delta X_t < 0\} \xrightarrow{p} 0$ under $P_{1-h/n^2}$. Since the Poisson distribution with
mean 0 concentrates all its mass at 0, this yields the result. The cases $h = 0$ or $g(0) = 1$
(recall $X_0 = 0$) are also trivial. So we consider the case $h > 0$ and $0 < g(0) < 1$. For
notational convenience, abbreviate $P_{1-h/n^2}$ by $P_n$ and $E_{1-h/n^2}$ by $E_n$. Put
$Z_t = 1\{\Delta X_t = -1, \varepsilon_t = 0\}$ and notice that
$0 \le 1\{\Delta X_t < 0\} - Z_t = 1\{\Delta X_t \le -2\} + 1\{\Delta X_t = -1, \varepsilon_t \ge 1\}$.
From (7) it now follows that

\[
0 \le \sum_{t=1}^n 1\{\Delta X_t < 0\} - \sum_{t=1}^n Z_t
\le 2\sum_{t=1}^n 1\{X_{t-1} - \vartheta\circ X_{t-1} \ge 2\} \xrightarrow{p} 0, \quad\text{under } P_n.
\]

Thus it suffices to prove that $\sum_{t=1}^n Z_t \xrightarrow{d} \text{Poisson}(hg(0)\mu_G/2)$ under $P_n$. We do this
by applying Lemma A.1. Introduce random variables $Y_n$, where $Y_n$ follows a Poisson
distribution with mean $\lambda_n = \sum_{t=1}^n E_nZ_t$, and let $Z$ follow a Poisson distribution with
mean $hg(0)\mu_G/2$. From Lemma A.1 we obtain the bound

\[
\sup_{A\subset\mathbb{Z}_+}\Bigl|P_n\Bigl\{\sum_{t=1}^n Z_t \in A\Bigr\} - \Pr\{Y_n \in A\}\Bigr|
\le \sum_{t=1}^n (E_nZ_t)^2 + \sum_{t=1}^n E_n\bigl|E_n[Z_t - E_nZ_t \mid Z_1,\dots,Z_{t-1}]\bigr|.
\]

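If we prove that

\[
\text{(i)}\ \sum_{t=1}^n (E_nZ_t)^2 \to 0, \qquad
\text{(ii)}\ \sum_{t=1}^n E_nZ_t \to \frac{hg(0)\mu_G}{2}, \qquad
\text{(iii)}\ \sum_{t=1}^n E_n\bigl|E_n[Z_t - E_nZ_t \mid Z_1,\dots,Z_{t-1}]\bigr| \to 0,
\]

all hold as $n\to\infty$, then the result follows since we then have for all $z\in\mathbb{R}$,

\[
\Bigl|P_n\Bigl\{\sum_{t=1}^n Z_t \le z\Bigr\} - \Pr(Z \le z)\Bigr|
\le \Bigl|P_n\Bigl\{\sum_{t=1}^n Z_t \le z\Bigr\} - \Pr\{Y_n \le z\}\Bigr|
+ \bigl|\Pr\{Y_n \le z\} - \Pr(Z \le z)\bigr| \to 0.
\]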



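First we tackle (i). Using that, conditional on $X_{t-1}$, $\varepsilon_t$ and
$X_{t-1} - \vartheta\circ X_{t-1} \sim \text{Bin}_{X_{t-1},h/n^2}$ are independent, we obtain

\[
E_nZ_t = P_n\{\varepsilon_t = 0,\ X_{t-1} - \vartheta\circ X_{t-1} = 1\}
= \frac{hg(0)}{n^2}\,E_nX_{t-1}\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}.
\]

Then (i) is easily obtained using (4),

\[
\lim_{n\to\infty}\sum_{t=1}^n (E_nZ_t)^2 \le \lim_{n\to\infty}\frac{h^2g(0)^2}{n^4}\sum_{t=1}^n E_nX_{t-1}^2 = 0.
\]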

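Next we consider (ii). If we prove the relation

\[
\lim_{n\to\infty}\Bigl|\frac{1}{n^2}\sum_{t=1}^n E_nX_{t-1}
- \frac{1}{n^2}\sum_{t=1}^n E_nX_{t-1}\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}\Bigr| = 0,
\]

it is immediate that (ii) follows from (3). To prove the previous display, we introduce
$B_n = \{\forall t\in\{1,\dots,n\}: X_t \le n^{7/4}\}$ with $\lim_{n\to\infty} P_n(B_n) = 1$ (see (3)). On the event $B_n$
we have $n^{-2}X_t \le n^{-1/4}$ for $t = 1,\dots,n$. This yields

\[
\begin{aligned}
0 &\le E_nX_{t-1}\Bigl(1-\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}\Bigr)
\le E_nX_{t-1}\Bigl(1-\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}}\Bigr)1_{B_n} + E_nX_{t-1}1_{B_n^c}\\
&\le E_nX_{t-1}\sum_{j=1}^{X_{t-1}}\binom{X_{t-1}}{j}\Bigl(\frac{h}{n^2}\Bigr)^j 1_{B_n} + E_nX_{t-1}1_{B_n^c}
\le \frac{h}{n^{1/4}}\exp(h)\,E_nX_{t-1} + E_nX_{t-1}1_{B_n^c}.
\end{aligned}
\]

Using $P_n(B_n) \to 1$ and (3) we obtain

\[
\lim_{n\to\infty}\frac{1}{n^2}\sum_{t=1}^n E_nX_{t-1}1_{B_n^c}
\le \lim_{n\to\infty}\sqrt{E_n\Bigl(\frac{1}{n^2}\sum_{t=1}^n X_{t-1}\Bigr)^2 P_n(B_n^c)}
= \sqrt{\Bigl(\frac{\mu_G}{2}\Bigr)^2\cdot 0} = 0.
\]

By (4) we have $\lim_{n\to\infty}\sum_{t=1}^n n^{-9/4}E_nX_{t-1} = 0$. Combination with the previous two
displays yields the result.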

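Finally, we prove (iii). Let $\mathcal{F}^\varepsilon = (\mathcal{F}^\varepsilon_t)_{t\ge1}$ and $\mathcal{F}^X = (\mathcal{F}^X_t)_{t\ge0}$ be the filtrations
generated by $(\varepsilon_t)_{t\ge1}$ and $(X_t)_{t\ge0}$, respectively, that is, $\mathcal{F}^\varepsilon_t = \sigma(\varepsilon_1,\dots,\varepsilon_t)$ and
$\mathcal{F}^X_t = \sigma(X_0,\dots,X_t)$. Note that we have, for $t \ge 2$,

\[
\begin{aligned}
E_n\bigl|E_n[Z_t - E_nZ_t \mid Z_1,\dots,Z_{t-1}]\bigr|
&\le E_n\bigl|E_n[Z_t - E_nZ_t \mid \mathcal{F}^\varepsilon_{t-1}\vee\mathcal{F}^X_{t-1}]\bigr|\\
&= \frac{hg(0)}{n^2}\,E_n\Bigl|X_{t-1}\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}
- E_nX_{t-1}\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}\Bigr|.
\end{aligned} \tag{4}
\]

Using the reverse triangle inequality we obtain

\[
\begin{aligned}
&\Bigl|\,E_n\Bigl|X_{t-1}\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1} - E_nX_{t-1}\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}\Bigr|
- E_n|X_{t-1} - E_nX_{t-1}|\,\Bigr|\\
&\qquad\le E_n\Bigl|X_{t-1}\Bigl(1-\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}\Bigr)
- E_nX_{t-1}\Bigl(1-\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}\Bigr)\Bigr|\\
&\qquad\le 2E_nX_{t-1}\Bigl(1-\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}\Bigr).
\end{aligned}
\]

We have already seen in the proof of (ii) that

\[
\lim_{n\to\infty}\frac{1}{n^2}\sum_{t=1}^n E_nX_{t-1}\Bigl(1-\Bigl(1-\frac{h}{n^2}\Bigr)^{X_{t-1}-1}\Bigr) = 0.
\]

A combination of the previous two displays with (4) now easily yields the bound

\[
\sum_{t=1}^n E_n\bigl|E_n[Z_t - E_nZ_t \mid Z_1,\dots,Z_{t-1}]\bigr|
\le o(1) + \frac{hg(0)}{n^2}\sum_{t=1}^n\sqrt{\operatorname{Var}_{1-h/n^2}X_{t-1}}. \tag{5}
\]

From (2) we have for $\theta < 1$, $\operatorname{Var}_\theta X_t \le (\sigma_G^2+\mu_G)(1-\theta^{2t})(1-\theta^2)^{-1}$. And for $1 \le t \le n$
we have $0 \le 1-(1-h/n^2)^{2t} \le n^{-1}\exp(2h)$. Now we easily obtain

\[
\frac{1}{n^2}\sum_{t=1}^n\sqrt{\operatorname{Var}_{1-h/n^2}X_{t-1}}
\le \frac{1}{n}\sqrt{\frac{(\sigma_G^2+\mu_G)\exp(2h)}{n\,(1-(1-h/n^2)^2)}} \to 0 \quad\text{as } n\to\infty.
\]

A combination with (5) yields (iii). This concludes the proof. □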
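The Poisson limit for the number of downward movements can be made tangible with a Monte Carlo experiment. The sketch below is not part of the paper: it assumes Poisson($\mu$) innovations, so that the limiting mean is $hg(0)\mu_G/2 = he^{-\mu}\mu/2$, and all function names are hypothetical. It compares the empirical mean of $\sum_t 1\{\Delta X_t < 0\}$ and the empirical frequency of "no downward move" with their Poisson counterparts.

```python
import math
import random

def poisson_draw(mu, rng):
    """Knuth's multiplication method; adequate for small mu."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k, prod = k + 1, prod * rng.random()
    return k

def thinning_loss(x, p, rng):
    """Draw Binomial(x, p) by CDF inversion (fast because x * p is tiny here)."""
    u, k, pmf = rng.random(), 0, (1.0 - p) ** x
    cdf = pmf
    while u > cdf and k < x:
        pmf *= (x - k) / (k + 1) * p / (1.0 - p)
        k, cdf = k + 1, cdf + pmf
    return k

def count_downward_moves(n, h, mu, rng):
    """Number of t in {1, ..., n} with X_t < X_{t-1}, for theta = 1 - h/n^2, X_0 = 0."""
    x, count = 0, 0
    for _ in range(n):
        new_x = x - thinning_loss(x, h / n**2, rng) + poisson_draw(mu, rng)
        count += new_x < x
        x = new_x
    return count

rng = random.Random(1)
n, h, mu, reps = 200, 2.0, 1.0, 2000
lam = h * math.exp(-mu) * mu / 2.0        # h * g(0) * mu_G / 2 with g(0) = exp(-mu)
counts = [count_downward_moves(n, h, mu, rng) for _ in range(reps)]
print("mean count:", sum(counts) / reps, "Poisson mean:", lam)
print("P(count = 0):", counts.count(0) / reps, "exp(-lam):", math.exp(-lam))
```

For moderate $n$ the empirical quantities should already be close to the Poisson values, up to Monte Carlo noise and a small finite-sample bias.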

Proof of Lemma 3.1. We obtain, for $h > 0$, $h_0 \ge 0$, using the inequality
$|\log((a+b)/(c+d)) - \log(a/c)| \le b/a + d/c$ for $a,c > 0$, $b,d \ge 0$, the bound

\[
|\mathcal{L}_n(h,h_0) - \tilde{\mathcal{L}}_n(h,h_0)|
\le \sum_{t=1}^n\frac{R_n(X_{t-1},X_t,h)}{L_n(X_{t-1},X_t,h)}
+ \sum_{t=1}^n\frac{R_n(X_{t-1},X_t,h_0)}{L_n(X_{t-1},X_t,h_0)},
\quad P_{1-h_0/n^2}\text{-a.s.} \tag{6}
\]

It is easy to see, since $b_{n,0}(k) = 0$ for $k > 0$ and $g(i) = 0$ for $i < 0$, that for $h_0 > 0$,
$\mathcal{L}_n(0,h_0)$ and $\tilde{\mathcal{L}}_n(0,h_0)$ both contain $\log(0)$ exactly $\sum_{t=1}^n 1\{\Delta X_t < 0\}$ times. Also for
$\sum_{t=1}^n 1\{\Delta X_t < 0\} = 0$ we have

\[
|\mathcal{L}_n(0,h_0) - \tilde{\mathcal{L}}_n(0,h_0)| \le \sum_{t=1}^n\frac{R_n(X_{t-1},X_t,h_0)}{L_n(X_{t-1},X_t,h_0)}.
\]

Thus if we show that

\[
\sum_{t=1}^n\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')} \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2},
\]

holds for $h' = h$ and $h' = h_0$, the lemma is proved (exclude the case $h' = 0$ and $h_0 > 0$,
which need not be considered). We split the expression in the previous display into four
non-negative parts (empty sums are by definition equal to 0),

\[
\sum_{t=1}^n\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')}
= \sum_{t:\Delta X_t\le-2}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')}
+ \sum_{t:\Delta X_t=-1}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')}
+ \sum_{t:0\le\Delta X_t\le M}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')}
+ \sum_{t:\Delta X_t>M}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')}.
\]

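Since $\Delta X_t \le -2$ implies $X_{t-1} - \vartheta\circ X_{t-1} \ge 2$, (7) implies

\[
\sum_{t:\Delta X_t\le-2}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')} \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2}.
\]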

Next we treat the terms for which $\Delta X_t = -1$. If $h_0 = 0$, we do not have such terms (under
$P_{1-h_0/n^2}$) and remember that the case $h' = 0$ and $h_0 > 0$ need not be considered. So we
only need to consider this term for $h', h_0 > 0$. On the event $A_n^{h'}$ (see (5) for the definition
of this event), an application of (21) yields,

\[
\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')}
\le \frac{\sum_{k=3}^{X_{t-1}} b_{X_{t-1},h'/n^2}(k)}{g(0)\,b_{X_{t-1},h'/n^2}(1)}
\le \frac{2\,(X_{t-1}^3/3!)(h'^3/n^6)(1-h'/n^2)^{X_{t-1}-3}}{g(0)X_{t-1}(h'/n^2)(1-h'/n^2)^{X_{t-1}-1}}
\le \frac{4h'^2}{3g(0)n^4}X_{t-1}^2,
\]

since $(1-h'/n^2)^{-2} \le 4$ by definition of $\mathcal{A}_n^{h'}$. From (4) and (6) we now obtain

\[
\sum_{t:\Delta X_t=-1}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')} \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2}.
\]

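Next, we analyze the terms for which $0 \le \Delta X_t \le M$. We have, by (21), on the event $A_n^{h'}$,

\[
\sum_{t:0\le\Delta X_t\le M}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')}
\le \sum_{t:0\le\Delta X_t\le M}\frac{\sum_{k=2}^{X_{t-1}} b_{X_{t-1},h'/n^2}(k)\,g(\Delta X_t+k)}{g(\Delta X_t)\,b_{X_{t-1},h'/n^2}(0)}
\le \frac{2}{m^*}\sum_{t:0\le\Delta X_t\le M}\frac{b_{X_{t-1},h'/n^2}(2)}{b_{X_{t-1},h'/n^2}(0)}
\le \frac{4h'^2}{m^*n^4}\sum_{t=1}^n X_{t-1}^2,
\]

where $m^* = \min\{g(k) \mid 0 \le k \le M\} > 0$. Now (4) and (6) yield the desired convergence,

\[
\sum_{t:0\le\Delta X_t\le M}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')} \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2}.
\]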

Finally, we discuss the terms for which $\Delta X_t > M$. If the support of $G$ equals $\{0,\dots,M\}$,
there are no such terms. So we only need to consider the case where the support of $G$ is
$\mathbb{Z}_+$. Since $g$ is non-increasing on $\{M, M+1, \dots\}$, we have, by (21),
$R_n(X_{t-1},X_t,h') \le 2g(\Delta X_t)\,b_{X_{t-1},h'/n^2}(2)$ for $X_{t-1} \in \mathcal{A}_n^{h'}$, which yields,

\[
\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')}
\le \frac{2g(\Delta X_t)(X_{t-1}^2/2)(h'^2/n^4)(1-h'/n^2)^{X_{t-1}-2}}{g(\Delta X_t)(1-h'/n^2)^{X_{t-1}}}
\le \frac{4h'^2}{n^4}X_{t-1}^2, \quad X_{t-1}\in\mathcal{A}_n^{h'}.
\]

From (4) and (6) it now easily follows that we have

\[
\sum_{t:\Delta X_t>M}\frac{R_n(X_{t-1},X_t,h')}{L_n(X_{t-1},X_t,h')} \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2}.
\]

This concludes the proof of the lemma. □
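The elementary inequality $|\log((a+b)/(c+d)) - \log(a/c)| \le b/a + d/c$, which drives the bound on the truncation error, follows from $0 \le \log(1+x) \le x$ for $x \ge 0$. The sketch below merely spot-checks it on an arbitrary grid; the helper name `log_ratio_gap` and the grid values are illustrative choices.

```python
import itertools
import math

def log_ratio_gap(a, b, c, d):
    """|log((a + b) / (c + d)) - log(a / c)| for a, c > 0 and b, d >= 0."""
    return abs(math.log((a + b) / (c + d)) - math.log(a / c))

positive = [1e-2, 0.5, 3.0]               # values for a and c
nonnegative = [0.0, 1e-3, 0.1, 1.0, 7.5]  # values for b and d
worst_slack = min(
    b / a + d / c - log_ratio_gap(a, b, c, d)
    for a, c in itertools.product(positive, repeat=2)
    for b, d in itertools.product(nonnegative, repeat=2)
)
print("smallest slack on the grid:", worst_slack)
```

A non-negative slack (up to rounding) on every grid point is consistent with the inequality; equality occurs when $b = d = 0$.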

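Proof of Lemma 3.2. We write $S_n^+(h,h_0) = \sum_{t:\Delta X_t\ge0}\log[1+U_{tn}^+]$, where

\[
\begin{aligned}
U_{tn}^+ ={}& \Bigl\{g(\Delta X_t)\Bigl[-2\,\frac{h-h_0}{n^2} + \frac{h^2-h_0^2}{n^4}\Bigr]
+ X_{t-1}\,g(\Delta X_t+1)\Bigl[\frac{h-h_0}{n^2} - \frac{h^2-h_0^2}{n^4}\Bigr]\Bigr\}\\
&\times\Bigl(g(\Delta X_t)\Bigl(1-\frac{h_0}{n^2}\Bigr)^2 + X_{t-1}\,g(\Delta X_t+1)\frac{h_0}{n^2}\Bigl(1-\frac{h_0}{n^2}\Bigr)\Bigr)^{-1}.
\end{aligned}
\]

Notice that, for $n$ large enough,

\[
|U_{tn}^+| \le \Bigl\{g(\Delta X_t)\Bigl|-2\,\frac{h-h_0}{n^2} + \frac{h^2-h_0^2}{n^4}\Bigr|
+ X_{t-1}\,g(\Delta X_t+1)\Bigl|\frac{h-h_0}{n^2} - \frac{h^2-h_0^2}{n^4}\Bigr|\Bigr\}
\times\Bigl(g(\Delta X_t)\Bigl(1-\frac{h_0}{n^2}\Bigr)^2\Bigr)^{-1}
\le \frac{C}{n^2}(1+X_{t-1}),
\]

for some constant $C$, where we used that $e \mapsto g(e+1)/g(e)$ is bounded. From (4) we
obtain

\[
\lim_{n\to\infty} E_{1-h_0/n^2}\sum_{t:\Delta X_t\ge0}(U_{tn}^+)^2
\le \lim_{n\to\infty}\frac{2C^2}{n^4}\sum_{t=1}^n\bigl(1+E_{1-h_0/n^2}X_{t-1}^2\bigr) = 0.
\]

Hence

\[
\sum_{t:\Delta X_t\ge0}(U_{tn}^+)^2 \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2},
\quad\text{and}\quad
\lim_{n\to\infty}P_{1-h_0/n^2}\Bigl\{\max_{t:\Delta X_t\ge0}|U_{tn}^+| \le 1/2\Bigr\} = 1. \tag{7}
\]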

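Using $\log(1+x) = x + r(x)$, where $r$ satisfies $|r(x)| \le 2x^2$ for $|x| \le 1/2$, we obtain from
(7),

\[
S_n^+(h,h_0) = \sum_{t:\Delta X_t\ge0}\log[1+U_{tn}^+] = \sum_{t:\Delta X_t\ge0}U_{tn}^+ + o(P_{1-h_0/n^2};1).
\]

Thus the problem reduces to determining the asymptotic behavior of $\sum_{t:\Delta X_t\ge0}U_{tn}^+$. Note
that,

\[
\sum_{t:\Delta X_t\ge0}U_{tn}^+
= \sum_{t:\Delta X_t\ge0}\frac{X_{t-1}\,g(\Delta X_t+1)\bigl[(h-h_0)/n^2 - (h^2-h_0^2)/n^4\bigr]}
{g(\Delta X_t)(1-h_0/n^2)^2 + X_{t-1}\,g(\Delta X_t+1)(h_0/n^2)(1-h_0/n^2)}
+ o(P_{1-h_0/n^2};1).
\]

Using that $e \mapsto g(e+1)/g(e)$ is bounded and (4), we obtain

\[
\Bigl|\sum_{t:\Delta X_t\ge0}\frac{X_{t-1}\,g(\Delta X_t+1)\bigl[(h-h_0)/n^2 - (h^2-h_0^2)/n^4\bigr]}
{g(\Delta X_t)(1-h_0/n^2)^2 + X_{t-1}\,g(\Delta X_t+1)(h_0/n^2)(1-h_0/n^2)}
- \sum_{t:\Delta X_t\ge0}\frac{(h-h_0)X_{t-1}\,g(\Delta X_t+1)}{n^2\,g(\Delta X_t)}\Bigr|
\le \frac{C}{n^4}\sum_{t=1}^n X_{t-1}^2 \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2}.
\]

Thus the previous three displays and (7) yield

\[
S_n^+(h,h_0) = \frac{h-h_0}{n^2}\sum_{t=1}^n X_{t-1}\frac{g(\Delta X_t+1)}{g(\Delta X_t)}
1\{\Delta X_t\ge0,\ X_{t-1}-\vartheta\circ X_{t-1}\le1\} + o(P_{1-h_0/n^2};1).
\]

Finally, we will show that

\[
\frac{1}{n^2}\sum_{t=1}^n X_{t-1}\frac{g(\Delta X_t+1)}{g(\Delta X_t)}
1\{\Delta X_t\ge0,\ X_{t-1}-\vartheta\circ X_{t-1}\le1\}
\xrightarrow{p} \frac{(1-g(0))\mu_G}{2}, \quad\text{under } P_{1-h_0/n^2}, \tag{8}
\]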

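which will conclude the proof. For notational convenience we introduce

\[
Z_t = \frac{g(\varepsilon_t+1)}{g(\varepsilon_t)}1\{X_{t-1}-\vartheta\circ X_{t-1} = 0\}
+ \frac{g(\varepsilon_t)}{g(\varepsilon_t-1)}1\{\varepsilon_t\ge1,\ X_{t-1}-\vartheta\circ X_{t-1} = 1\}.
\]

Using that $\varepsilon_t$ is independent of $X_{t-1}-\vartheta\circ X_{t-1}$, we obtain

\[
E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]
= (1-g(0))\,1\{X_{t-1}-\vartheta\circ X_{t-1} = 0\}
+ 1\{X_{t-1}-\vartheta\circ X_{t-1} = 1\}\,E\frac{g(\varepsilon_1)}{g(\varepsilon_1-1)}1\{\varepsilon_1\ge1\},
\]

where we used that $E\,g(\varepsilon_1+1)/g(\varepsilon_1) = 1-g(0)$ and $E\,1\{\varepsilon_1\ge1\}g(\varepsilon_1)/g(\varepsilon_1-1) < \infty$, since
we assumed that $g$ is eventually decreasing. So we have

\[
\begin{aligned}
Z_t - E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]
={}& \Bigl(\frac{g(\varepsilon_t+1)}{g(\varepsilon_t)} - (1-g(0))\Bigr)1\{X_{t-1}-\vartheta\circ X_{t-1} = 0\}\\
&+ \Bigl(\frac{g(\varepsilon_t)}{g(\varepsilon_t-1)}1\{\varepsilon_t\ge1\} - E\frac{g(\varepsilon_1)}{g(\varepsilon_1-1)}1\{\varepsilon_1\ge1\}\Bigr)1\{X_{t-1}-\vartheta\circ X_{t-1} = 1\}.
\end{aligned}
\]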

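From this it is not hard to see that we have

\[
\begin{aligned}
&E_{1-h_0/n^2}X_{t-1}\bigl(Z_t - E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]\bigr) = 0,\\
&E_{1-h_0/n^2}X_{t-1}\bigl(Z_t - E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]\bigr)
\times X_{s-1}\bigl(Z_s - E_{1-h_0/n^2}[Z_s \mid X_{s-1}-\vartheta\circ X_{s-1}]\bigr) = 0 \quad\text{for } s < t,\\
&E_{1-h_0/n^2}\bigl(Z_t - E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]\bigr)^2 \le C,
\end{aligned} \tag{9}
\]

for $C = 2(\operatorname{Var}(g(\varepsilon_1+1)/g(\varepsilon_1)) + \operatorname{Var}(1\{\varepsilon_1\ge1\}g(\varepsilon_1)/g(\varepsilon_1-1)))$, which is finite by
Assumption 3.1. Thus, by (4), it follows that

\[
E_{1-h_0/n^2}\Bigl(\frac{1}{n^2}\sum_{t=1}^n X_{t-1}\bigl(Z_t - E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]\bigr)\Bigr)^2
= \frac{1}{n^4}\sum_{t=1}^n E_{1-h_0/n^2}X_{t-1}^2\bigl(Z_t - E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]\bigr)^2 \to 0.
\]

Hence (8) is equivalent to

\[
\frac{1}{n^2}\sum_{t=1}^n X_{t-1}E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]
\xrightarrow{p} \frac{(1-g(0))\mu_G}{2}, \quad\text{under } P_{1-h_0/n^2}. \tag{10}
\]

Since, by (4),

\[
\frac{1}{n^2}\sum_{t=1}^n E_{1-h_0/n^2}X_{t-1}1\{X_{t-1}-\vartheta\circ X_{t-1} = 1\}
= \frac{h_0}{n^4}\sum_{t=1}^n E_{1-h_0/n^2}X_{t-1}^2\Bigl(1-\frac{h_0}{n^2}\Bigr)^{X_{t-1}-1}
\le \frac{h_0}{n^4}\sum_{t=1}^n E_{1-h_0/n^2}X_{t-1}^2 \to 0,
\]

we have, using (7),

\[
\begin{aligned}
&\Bigl|\frac{1}{n^2}\sum_{t=1}^n X_{t-1}E_{1-h_0/n^2}[Z_t \mid X_{t-1}-\vartheta\circ X_{t-1}]
- \frac{1-g(0)}{n^2}\sum_{t=1}^n X_{t-1}\Bigr|\\
&\qquad\le \Bigl|E\frac{g(\varepsilon_1)}{g(\varepsilon_1-1)}1\{\varepsilon_1\ge1\} - (1-g(0))\Bigr|\,
\frac{1}{n^2}\sum_{t=1}^n X_{t-1}1\{X_{t-1}-\vartheta\circ X_{t-1} = 1\}\\
&\qquad\quad+ \frac{1-g(0)}{n^2}\sum_{t=1}^n X_{t-1}1\{X_{t-1}-\vartheta\circ X_{t-1}\ge2\}
\xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2}.
\end{aligned}
\]

We conclude (10), which finally concludes the proof of the lemma. □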



Proof of Lemma 3.3. First we consider $h = 0$. From the definition of $S_n^-(0,h_0)$ we
see that $S_n^-(0,h_0) = 0$ if $\sum_{t=1}^n 1\{\Delta X_t < 0\} = 0$ (since an empty sum equals zero by
definition). And if $\sum_{t=1}^n 1\{\Delta X_t < 0\} \ge 1$, we have $S_n^-(0,h_0) = -\infty$ (since $W_{tn}^- = -\infty$ for
$h = 0$). This concludes the proof for $h = 0$.

So we now consider $h > 0$. We rewrite

\[
W_{tn}^- = \log\Biggl[\Bigl(\frac{h}{h_0}\Bigl(\frac{1-h/n^2}{1-h_0/n^2}\Bigr)
+ \frac{X_{t-1}-1}{2n^2}\,\frac{h^2g(1)}{g(0)h_0(1-h_0/n^2)}\Bigr)
\times\Bigl(1 + \frac{X_{t-1}-1}{2n^2}\,\frac{h_0g(1)}{g(0)(1-h_0/n^2)}\Bigr)^{-1}\Biggr].
\]

By (7), the proof is finished if we show that

\[
\sum_{t:\Delta X_t=-1}\Bigl|W_{tn}^- - \log\Bigl(\frac{h}{h_0}\Bigr)\Bigr| \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2}.
\]

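Using the inequality $|\log((a+b)/(c+d)) - \log(a/c)| \le b/a + d/c$ for $a,c > 0$, $b,d \ge 0$, we
obtain

\[
\begin{aligned}
\Bigl|W_{tn}^- - \log\Bigl(\frac{h}{h_0}\Bigr)\Bigr|
&\le \frac{X_{t-1}-1}{2n^2}\Bigl[\frac{h^2g(1)}{g(0)h_0(1-h_0/n^2)}
\Bigl(\frac{h}{h_0}\Bigl(\frac{1-h/n^2}{1-h_0/n^2}\Bigr)\Bigr)^{-1}
+ \frac{h_0g(1)}{g(0)(1-h_0/n^2)}\Bigr] + O(n^{-2})\\
&= \frac{X_{t-1}-1}{2n^2}\,\frac{g(1)}{g(0)}\Bigl[\frac{h}{1-h/n^2} + \frac{h_0}{1-h_0/n^2}\Bigr] + O(n^{-2}).
\end{aligned}
\]

Hence, it suffices to show that

\[
\frac{1}{n^2}\sum_{t:\Delta X_t=-1}X_{t-1} \xrightarrow{p} 0, \quad\text{under } P_{1-h_0/n^2}.
\]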

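Note first that we have, by (7),

\[
\frac{1}{n^2}\sum_{t=1}^n X_{t-1}1\{\Delta X_t = -1\}
= \frac{1}{n^2}\sum_{t=1}^n X_{t-1}1\{\Delta X_t = -1,\ \varepsilon_t = 0\} + o(P_{1-h_0/n^2};1).
\]

We show that the expectation of the first term on the right-hand side in the previous
display converges to zero, which will conclude the proof. We have, by (4),

\[
\lim_{n\to\infty}\frac{1}{n^2}\sum_{t=1}^n E_{1-h_0/n^2}X_{t-1}1\{\Delta X_t = -1,\ \varepsilon_t = 0\}
= \lim_{n\to\infty}\frac{1}{n^2}\sum_{t=1}^n E_{1-h_0/n^2}\,g(0)X_{t-1}^2\,\frac{h_0}{n^2}\Bigl(1-\frac{h_0}{n^2}\Bigr)^{X_{t-1}-1}
\le \lim_{n\to\infty}\frac{h_0g(0)}{n^4}\sum_{t=1}^n E_{1-h_0/n^2}X_{t-1}^2 = 0,
\]

which concludes the proof of the lemma. □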